Method and apparatus for depth ordering of digital images

ABSTRACT

In a method for relative depth ordering of parts of one or more digital images, the digital images are regularized by segmentation, and at least part of the pixels of the images are assigned to respective segments. The relative motion of the segments for successive images is estimated by image matching. The image features of the segments are regularized by dual segmentation, in which the edges of the segments are found, pixels are assigned to the edges, and dual segments are defined. The relative motion of the dual segments for successive images is estimated by image segment matching in order to determine the relative depth order of the image segments.

The present invention relates generally to the art of video and image processing. It particularly relates to depth ordering within frames of a video sequence based on motion estimation and will be described with particular reference thereto.

For various video sequence processing applications, the motion or the depth order of parts of an image needs to be found. Such applications include, for example, scan-rate up-conversion, MPEG coding, and motion-based depth estimation, and many of these applications require computational simplicity. Known methods of motion estimation are based on a matching approach. With such a method, each video frame is partitioned into segments. Then, for each element of the partition (i.e., each segment), a motion vector is estimated such that the amount of dissimilarity or “match penalty” between the shifted version of that segment in the current frame and its location in the following frame is minimized.

More particularly, in known methods of motion estimation and motion-based depth estimation, a motion vector Δx=(Δx,Δy) or a depth d is assigned to a part of the image as a result of minimizing a match error E over a limited set of candidate motion or depth values. It is assumed that the candidate values sample the graph of E as a function of the depth d or motion vector Δx sufficiently densely. Moreover, it is assumed that this graph has a sufficiently prominent global minimum.

While the basic algorithm partitions the image into square blocks, recent research has been devoted to partitioning the image into regions with arbitrary geometry, so-called segments, where the segment boundaries are aligned with luminosity or color discontinuities. In this way, segments can be interpreted as being parts of objects in the scene. This can improve the resolution and accuracy of the motion or depth field.

In the typical process of segment-based depth reconstruction out of video sequences, two processing steps are performed after having found a motion vector per segment. The first step is camera calibration, which results in the camera position and orientation. The second step is depth estimation from two subsequent frames, resulting in a per pixel depth estimate. These processing steps may be integrated.

In this depth estimation algorithm, camera calibration is required to enable the conversion of an apparent motion to a depth value. Camera calibration relates to the internal geometric and optical characteristics of the camera and the 3-D position and orientation of the camera frame relative to a certain world coordinate system. Camera calibration is, however, an unstable procedure. Moreover, current technology for the conversion of motion to camera parameters and depth can only be applied if the scene is static. Thus, the known depth estimation algorithms are of limited use if there is not much depth difference in the scene or when objects have their own motion relative to the remainder of the scene.

Further, it is known that depth order may be derived by comparing the motion of a region with the motion of its boundary. Recent methods have tried to solve the segmentation and depth ordering problems simultaneously. One such method is to locate regions and edges in the image, partition the edges into sets, and label the regions, as described in “Edge Tracking for Motion Segmentation and Depth Ordering,” P. Smith, T. Drummond, R. Cipolla, Proceedings of the British Machine Vision Conference, Vol. 2, pages 369-378, September 1999. Another such method is color segmentation and motion estimation, motion assignment, motion refinement, and region linking, as disclosed in “Integrated Segmentation and Depth Ordering of Motion Layers in Image Sequences,” D. Tweed and A. Calway, Proceedings of the British Machine Vision Conference, pages 322-331, September 2000.

However, the two methods mentioned above have limited applicability because in the first, only two depth layers are feasible, and in both methods a rather complicated global optimization is used.

The present invention is different in that it operates locally and compares the match error between region pairs to obtain a depth ordering. It represents an improvement in that it is based solely on the motion vectors, so that it does not require camera calibration, and it is valid for any number of depth layers. Further, no threshold is introduced.

According to one aspect of the invention, an apparatus for depth ordering of parts of one or more images, based on two or more digital images, is provided. An input section is provided for receiving the digital images. A first regularization means is provided for regularizing image features of the digital images, composed of pixels, by segmentation, and includes an assigning means for assigning at least part of the pixels of the images to respective segments. A first estimating means is provided for estimating relative motion of the segments for successive images by image matching. A second regularization means is provided for regularizing image features of the segments by dual segmentation and includes a means for finding the edges of the segments, an assigning means for assigning pixels to the edges, and a means for defining dual segments. A second estimating means is provided for estimating relative motion of the dual segments for successive images by image segment matching to determine relative depth order of segments of the images. An output section is provided for outputting relative depth ordering of parts of the images.

According to another aspect of the invention, a method for depth ordering of parts of one or more images using two or more digital images is provided. Image features of the digital images, which are composed of pixels, are regularized by segmentation, and at least part of the pixels of the images are assigned to respective segments. The relative motion of the segments for successive images is estimated by image matching. The image features of the segments are regularized by dual segmentation, which includes finding the edges of the segments, assigning pixels to the edges, and defining dual segments. The relative motion of the dual segments for successive images is estimated by image segment matching to determine relative depth order of parts of the images.

One advantage of the present invention resides in improving the manner in which relative depth order of digital images from successive frames in a video sequence is determined.

Another advantage of the present invention resides in being able to determine relative depth order without requiring camera calibration.

Yet another advantage of the present invention resides in being able to determine relative depth order for more than two depth layers in a digital image.

Yet another advantage of the present invention resides in improving the accuracy of the motion vector estimate.

Numerous additional advantages and benefits of the present invention will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiment.

The invention may take form in various components and arrangements of components, and in various steps and arrangements of steps. The drawings are only for the purpose of illustrating preferred embodiments and are not to be considered as limiting the invention.

FIG. 1 illustrates an example of a process for depth ordering of parts of digital images based on motion estimation.

FIG. 2 illustrates an example of an original segmentation of a portion of a frame from the Doll House sequence.

FIG. 3 illustrates an example of a dual segmentation of a portion of a frame from the Doll House sequence.

FIG. 4 illustrates an example of an original segmentation of a portion of a frame from the Dionysios sequence.

FIG. 5 illustrates an example of depth ordering of a portion of a frame from the Dionysios sequence.

FIG. 6 schematically shows a device for depth ordering of parts of digital images.

In the following preferred embodiment, a process for determining depth order relationships of parts of digital images is explained. These images can be subsequent images from a video stream, but the depth order process is not limited thereto.

With reference to FIG. 1, a process 10 depth orders parts of images 20 within a frame. A first step 30 of the process 10 is segmentation of the images 20 in the frames. A second step 40 is determining matching sections in subsequent segmented images from the video stream. A third step 50 is dual segmentation of the images 20. A fourth step 60 is determining the motion of dual segments of the image through image segment matching. An output 70 is relative depth orders of the parts of the images 20.

The images 20 are digital images consisting of image pixels and defined as two 2-dimensional digital images I₁(x, y) and I₂(x, y), wherein x and y are the coordinates indicating the individual pixels of the images. The process 10 includes the calculation of a pair of functions M=(Δx(x, y), Δy(x, y)). M is defined such that every pixel in the image I₁ is mapped to a pixel in image I₂ according to the formula: I₂(x, y)=I₁(x+Δx(x, y), y+Δy(x, y)). The construction of M is modified by redefining M as a function that is constant for groups of pixels having a similar motion.
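By way of illustration only, the mapping formula can be sketched for the special case in which M is constant over the whole image. The function name `warp_constant`, the list-of-rows image representation, and the `fill` value for pixels whose source falls outside I₁ are assumptions made for this sketch and are not part of the described embodiment.

```python
def warp_constant(i1, dx, dy, fill=None):
    """Build I2 from I1 using I2(x, y) = I1(x + dx, y + dy).

    i1 is a list of rows (i1[y][x]); dx and dy are one constant motion
    for all pixels. Pixels whose source lies outside I1 get `fill`
    (an assumption of this sketch).
    """
    height, width = len(i1), len(i1[0])
    i2 = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sx, sy = x + dx, y + dy  # source coordinates in I1
            if 0 <= sx < width and 0 <= sy < height:
                i2[y][x] = i1[sy][sx]
    return i2


if __name__ == "__main__":
    frame = [[1, 2, 3],
             [4, 5, 6]]
    print(warp_constant(frame, 1, 0))
```

In the process 10, M is constant per segment rather than per image, so the same lookup would be applied segment by segment with each segment's own motion vector.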

A collection of pixels for which M is said to be constant is composed of pixels that are suspected of having a similar motion. To find such collections, the images 20 are divided into segments by means of the segmentation step 30. Image I₁ is thus divided into segments consisting of pixels that are bounded by borders, which define the respective segments. Segmentation of an image amounts to deciding, for every pixel in an image, the membership to one of a finite set of segments, where a segment is a connected collection of pixels. Image segmentation methods can be generally divided into feature-based and region-based methods. With respect to the depth ordering process 10, the type of image segmentation used should, at a minimum, identify the motion discontinuities. It is assumed that motion and color discontinuities coincide, which means that the segmentation algorithm preferably puts segment borders at color boundaries. However, it may also put segment boundaries elsewhere. As placing borders at color boundaries is one of the major purposes of image segmentation, the particular choice of color-based image segmentation algorithm is not crucial to the present depth ordering process. FIG. 2 shows a frame from the Doll House sequence that has undergone color boundary segmentation.

The second step 40 of the process 10 is image matching, or segment-based motion estimation. More particularly, in the preferred embodiment, the second step 40 includes a determination of the displacement function M for a segment between image I₁ and image I₂, whereby a projection of the segment in the image I₂ needs to be found that matches the segment to produce M. This is done by selecting a number of possible match candidates of image I₂ for the match with the segment, calculating a matching criterion for each candidate, and then selecting the candidate with the best matching result. The matching criterion is a measure of the certainty that the segment of the first image matches with a projection in the second image. The matching criterion is used in digital image processing and is known in its implementation as minimizing a matching error or matching penalty function. Such functions and methods of matching by minimizing a matching function are known in the art.

Accordingly, with a segment and a candidate motion vector, the location of the pixels of the segment in the next image is predicted. Thus, in the second step 40, a comparison is made of the predicted pixel colors with the actual colors observed in the second image. The difference between the predicted and the actual colors is summed over the segment and called the match penalty or “SAD error.” (SAD is an acronym for the Sum of Absolute Differences.) Finally, the candidate motion vector which has the smallest match penalty is assigned to each segment. To do this efficiently, smart choices for the candidate motion vectors are preferably made (for instance, the optimal motion vector of a neighboring segment), but this aspect is not crucial to the invention.
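By way of illustration only, the selection of a motion vector by minimum match penalty can be sketched as follows. The sketch assumes single-channel (grey-value) images stored as lists of rows, a segment given as a list of (x, y) pixels of the first image, and a motion vector (dx, dy) that predicts pixel (x, y) of the first image at (x+dx, y+dy) in the second image. The names `sad_penalty` and `best_motion` and the fixed cost for predictions falling outside the image are assumptions of this sketch.

```python
def sad_penalty(i1, i2, segment, motion, out_of_bounds_cost=255):
    """Sum of absolute differences between a segment's pixels in i1 and
    their predicted positions in i2 under the candidate motion (dx, dy)."""
    dx, dy = motion
    height, width = len(i2), len(i2[0])
    total = 0
    for x, y in segment:
        tx, ty = x + dx, y + dy  # predicted position in the second image
        if 0 <= tx < width and 0 <= ty < height:
            total += abs(i1[y][x] - i2[ty][tx])
        else:
            total += out_of_bounds_cost  # assumed cost, not from the source
    return total


def best_motion(i1, i2, segment, candidates):
    """Return the candidate motion vector with the smallest match penalty."""
    return min(candidates, key=lambda m: sad_penalty(i1, i2, segment, m))


if __name__ == "__main__":
    first = [[0, 0, 0, 0],
             [0, 9, 9, 0],
             [0, 0, 0, 0]]
    second = [[0, 0, 0, 0],
              [0, 0, 9, 9],
              [0, 0, 0, 0]]
    bright = [(1, 1), (2, 1)]
    print(best_motion(first, second, bright, [(0, 0), (1, 0), (-1, 0)]))
```

The candidate list is where the "smart choices" mentioned above would enter, for example by including the optimal motion vector of a neighboring segment.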

The third step 50 in the depth ordering process 10 is the defining of a dual segmentation for each image. As stated earlier, segmentation of an image amounts to deciding, for every pixel in the image, the membership to one of a finite set of segments, where a segment is a connected collection of pixels. A particularly advantageous method of the dual segmentation is the so-called “quasi segmentation” method. In the quasi segmentation method, so-called “seeds” of segments are grown by means of a distance transform such that at least part of the pixels are assigned to a seed. This results in significantly decreased calculation costs and increased calculation speeds. The quasi segments can thus be used in matching of segments in subsequent images.

The dual segmentation step 50 consists of two components: finding the edges of the segments and assigning pixels to the segments. Thus, based on the original segmentation, for each pair of segments (S_(i), S_(j)), all edge pixels are labeled with a number e_(ij), i.e., those pixels p for which p ∈ S_(i) and ∃q ∈ N₄(p) such that q ∈ S_(j), and those for which p ∈ S_(j) and ∃q ∈ N₄(p) such that q ∈ S_(i), where N₄(p) denotes the 4-neighborhood of p. The dual segment S_(ij) is now created, whereby the seed corresponds to the edge pixels e_(ij). A seed consists of seed pixels, wherein seed pixels are the pixels of the image that are closest to the hard border sections. The seeds form an approximation of the border sections within the digital image pixel array; as the seeds fit within the pixel array, subsequent calculations can be performed easily. Seed pixels are defined all along the detected border between the two segments, giving rise to two-pixel wide double chains. The chain of seed pixels along the border (in this case, both sides are part of the same seed) is regarded as a seed and indicated by a unique identifier. As a result of edge detection, the seed pixels essentially form chains. Seeds can also be arbitrarily shaped clusters of edge pixels, in particular seeds having a width of more than a single pixel. A distance transform gives, for every pixel (x, y), the shortest distance d(x, y) to the nearest seed point. Any suitable definition for the distance can be used, such as the Euclidean, “city block” or “chessboard” distance. Methods for calculating the distance to the nearest seed point for each pixel are known in the art, and in implementing the process 10 any suitable method can be used.
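By way of illustration only, the labeling of edge pixels can be sketched as follows: a pixel is an edge pixel for the pair (S_(i), S_(j)) if it belongs to one of the two segments and at least one pixel of its 4-neighborhood belongs to the other. The function name `edge_seeds`, the representation of the segmentation as a list of rows of segment identifiers, and the use of the sorted pair (i, j) as the seed identifier are assumptions of this sketch.

```python
def edge_seeds(labels):
    """Map each neighboring segment pair (i, j), i < j, to its set of
    edge pixels (x, y). labels[y][x] holds the segment id of pixel (x, y)."""
    height, width = len(labels), len(labels[0])
    seeds = {}
    for y in range(height):
        for x in range(width):
            own = labels[y][x]
            # 4-neighborhood: right, left, below, above
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if 0 <= nx < width and 0 <= ny < height:
                    other = labels[ny][nx]
                    if other != own:
                        pair = (min(own, other), max(own, other))
                        seeds.setdefault(pair, set()).add((x, y))
    return seeds


if __name__ == "__main__":
    segmentation = [[0, 0, 1, 1],
                    [0, 0, 1, 1]]
    print(edge_seeds(segmentation))
```

Because pixels on both sides of a border are collected under the same key, each seed is a two-pixel wide double chain, as described above. In this sketch a pixel at a junction of three or more segments appears in more than one seed.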

The algorithm that is used in the preferred embodiment is based on two passes over all pixels in the image I(x, y), resulting in values for d(x, y) indicating the distance to the closest seed. The values for d(x, y) are initialized. In the first pass, from the upper left to lower right of image I, the value d(x, y) is set equal to the minimum of itself and each of its neighbors plus the distance to get to that neighbor. In a second pass, the same procedure is followed while the pixels are scanned from the lower right to upper left of the image I. After these two passes, all d(x, y) have their correct values, representing the closest distance to the nearest seed point.

During the two passes where the d(x, y) distance array is filled with the correct values, the item buffer b(x, y) is updated with the identification of the closest seed for each of the pixels (x, y). After the distance transformation, the item buffer b(x, y) has for each pixel (x, y) the value associated with the closest seed. This results in the digital image being segmented; the segments are formed by pixels (x, y) with identical values b(x, y). Thus, parts of the segments on both sides of an edge form a dual segment. This aspect is best seen in FIGS. 2 and 3, which feature a portion of a frame from the Doll House sequence. Depicted in these figures is an arch. In FIG. 2, the original segmentation, the arch consists of black and grey segments, which are separated by the edge. In FIG. 3, a dual segment exists that is partly in the black part, partly in the grey part, and consists of those pixels that are closer to the edge between the two parts in the original segmentation than to any other edge in the original segmentation.
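By way of illustration only, a two-pass distance transform with an item buffer can be sketched as follows. The sketch assumes the “city block” distance with unit steps to the 4-neighbors, seed pixels initialized to distance 0, and all other pixels initialized to infinity; the function name `dual_segmentation` and the input format (a list of rows holding a seed identifier or `None`) are assumptions of this sketch.

```python
INF = float("inf")


def dual_segmentation(seed_ids):
    """Two-pass city-block distance transform.

    seed_ids[y][x] is a seed identifier for seed pixels and None otherwise.
    Returns (d, b): d[y][x] is the distance to the nearest seed and
    b[y][x] is the identifier of that seed (the item buffer).
    """
    height, width = len(seed_ids), len(seed_ids[0])
    d = [[0 if seed_ids[y][x] is not None else INF for x in range(width)]
         for y in range(height)]
    b = [row[:] for row in seed_ids]

    def relax(x, y, nx, ny):
        # Adopt the neighbor's seed if going through it is shorter.
        if 0 <= nx < width and 0 <= ny < height and d[ny][nx] + 1 < d[y][x]:
            d[y][x] = d[ny][nx] + 1
            b[y][x] = b[ny][nx]

    # First pass: upper left to lower right, using left and upper neighbors.
    for y in range(height):
        for x in range(width):
            relax(x, y, x - 1, y)
            relax(x, y, x, y - 1)
    # Second pass: lower right to upper left, using right and lower neighbors.
    for y in range(height - 1, -1, -1):
        for x in range(width - 1, -1, -1):
            relax(x, y, x + 1, y)
            relax(x, y, x, y + 1)
    return d, b


if __name__ == "__main__":
    distances, items = dual_segmentation([["a", None, None, None, "b"]])
    print(distances)
    print(items)
```

Pixels with identical values in the returned item buffer form one dual segment. Ties between equally distant seeds are resolved here in favor of the seed reached first, which is a choice of this sketch rather than of the described embodiment.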

The fourth step 60 in the process 10 is to compute the match penalties for each of the dual segments for two candidates. Each border of the original segmentation gives rise to a segment in the dual segmentation. Since there is now a dual segmentation, image matching is once again undertaken. However, to make the process faster and more efficient in this step, only two candidates for each border are used: the optimal motion vectors of the segments on either side of the border. These are the motion vectors that minimize the match penalty of the respective original segments.

Thus, in the preferred embodiment, the two candidates for segment S_(ij) are the optimal motion vectors between the two or more images or frames for the original segments S_(i) and S_(j). The corresponding match penalties are called M_(i) and M_(j). After the match penalties are determined, it is decided which segment is the closer one; this decision is the output 70. This task is accomplished by comparing M_(i) to M_(j). If M_(i) is less than M_(j), then S_(i) is the closer segment. Likewise, if M_(i) is greater than M_(j), then S_(j) is the closer segment. Thus, the likelihood that a correct determination has been made can be given in terms of the difference M_(i)−M_(j).
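By way of illustration only, the comparison can be sketched as follows, where `penalty_i` and `penalty_j` stand for the match penalties M_(i) and M_(j) of the dual segment S_(ij) under the optimal motion vectors of S_(i) and S_(j). The function name `closer_segment`, the returned labels, and the treatment of equal penalties as undecided are assumptions of this sketch; the source does not address the case of equal penalties.

```python
def closer_segment(penalty_i, penalty_j):
    """Decide which of two neighboring segments is closer to the viewer.

    Returns (winner, confidence): winner is "i", "j", or None when the
    penalties are equal; confidence is the absolute penalty difference.
    """
    difference = penalty_i - penalty_j
    if difference < 0:
        return "i", -difference  # M_i < M_j: S_i is the closer segment
    if difference > 0:
        return "j", difference   # M_i > M_j: S_j is the closer segment
    return None, 0               # undecided (assumption of this sketch)


if __name__ == "__main__":
    print(closer_segment(3, 10))
    print(closer_segment(10, 3))
```

No threshold is applied to the difference; it is only reported as a measure of confidence in the ordering.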

To explain why this improved depth ordering process 10 works, it is noted that, by the definition of the segmentation, an edge is characterized by a large color contrast relative to the texture within a segment. The edge (or the color contrast) has the same motion as the closer segment: the edge belongs to that segment. The farther segment has pixels that lie beneath (are occluded by) the closer segment, and the movement of the edge is not related to the movement of the farther segment. The match penalty is sensitive to the color contrast; thus, it will be lowest for the motion vector that corresponds to the motion of the closer segment.

FIGS. 4 and 5 illustrate the results of the depth ordering method for a portion of a pair of frames of the Dionysios sequence at slightly shifted camera positions. Depth contrasts are encoded in FIG. 5 as black/white edges, where the light part is the upper side and the dark part the lower side. The size of the contrast indicates the difference in match penalty, or the confidence in the depth ordering. It can be seen that the foreground and the background are ordered adequately.

As an alternative embodiment of the invention, it is possible to do full image matching (or motion estimation) for the dual segmentation and only test a limited number of candidates (e.g., the optimal motion vectors of all the edges surrounding a segment) for the original segments.

One advantage of the depth ordering process 10 is that the extra computational expense is relatively small. The dual segmentation consists of a distance transform, which can be implemented as a two-pass operation over the digital image, and only two candidate motion vectors have to be evaluated for each dual segment. This can be made even cheaper by matching only in a small region (e.g., 4 pixels wide) around the edge and not for the full dual segment.

The depth order of segments may also be used in a RANSAC-based camera calibration algorithm, where parameter estimates that are inconsistent with the derived depth order can be discarded.

A computer program product including computer program code sections for performing the above steps can be stored on a suitable information carrier such as a hard or floppy disc or CD-ROM or stored in a memory section of a computer. It may also be directly implemented in specific or reconfigurable hardware.

With reference to FIG. 6, a device 100 for depth ordering of digital images includes a processing unit 120 for depth ordering of parts of digital images according to the method as described above. The processing unit 120 includes a first regularization component 130 for segmentation of the images, a first image matching component 140 for estimating motion of the segments, a second regularization component 150 for dual segmentation of the images, and a second image matching component 160. The processing unit 120 is connected with an input section 110 by which digital images are received and put through to the processing unit 120. The processing unit 120 is further connected to an output section 170 through which the resulting relative depth order of parts of the digital images is output. The device 100 may be included in a display apparatus 200, such as a 3-dimensional television product.

The invention has been described with reference to the preferred embodiments. Obviously, modifications and alterations will occur to others upon reading and understanding the preceding detailed description. It is intended that the invention be construed as including all such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

1. An apparatus (100) for depth ordering of parts of one or more digital images comprising: an input section (110) for receiving the digital images; a first regularization means (130) for regularizing image features of the digital images, composed of pixels, by segmentation, including assigning means (130) for assigning at least part of the pixels of the images to respective segments; a first estimating means (140) for estimating relative motion of the segments for successive images by image matching; a second regularization means (150) for regularizing image features of the segments by dual segmentation, including a means (150) for finding the edges of the segments, an assigning means (150) for assigning pixels to the edges, and a means (150) for creating dual segments; a second estimating means (160) for estimating relative motion of the dual segments for successive images by image segment matching to determine relative depth order of the image segments; and an output section (170) for outputting relative depth ordering of parts of the images.

2. The apparatus (100) for depth ordering of parts of one or more digital images as set forth in claim 1, wherein the digital images include frames of a two-dimensional video sequence.

3. The apparatus (100) for depth ordering of parts of one or more digital images as set forth in claim 1, wherein the first estimating means (140) includes a defining means (140) for defining a finite set of candidate values wherein a candidate value represents a candidate for a possible match between image features of two or more images; an establishing means (140) for establishing a matching penalty function for evaluation of the candidate values; a selecting means (140) for selecting a candidate value based on the result of the evaluation of the matching penalty function.

4. The apparatus (100) for depth ordering of parts of one or more digital images as set forth in claim 1, wherein the dual segments are defined by taking a pixel along the border of two neighboring segments as seed pixels, and assigning parts of the remaining pixels to one of the seeds using a distance transform algorithm.

5. The apparatus (100) for depth ordering of parts of one or more digital images as set forth in claim 1, wherein the second estimating means (160) includes a calculating means (160) for calculating optimal motion vectors for the dual segments; a computing means (160) for computing match penalties for the dual segments; a selecting means (160) for selecting a closer segment by comparing the optimal motion vectors.

6. A display apparatus (200) comprising the apparatus (100) as set forth in claim 1.

7. A method for relative depth ordering of parts of one or more digital images, the method including: providing one or more digital images; regularizing image features of the digital images, composed of pixels, by segmentation, including assigning at least part of the pixels of the images to respective segments; estimating relative motion of the segments for successive images by image matching; further regularizing image features of the segments by dual segmentation, including finding the edges of the segments, assigning pixels to the segments, and defining dual segments; estimating relative motion of the borders of the dual segments for successive images by image segment matching to determine relative depth order of parts of the images.

8. The method for depth ordering of parts of one or more digital images as set forth in claim 7, wherein the digital images include frames of a two-dimensional video sequence.

9. The method for depth ordering of parts of one or more digital images as set forth in claim 7, wherein estimating the relative motion of the segments includes defining a finite set of candidate values wherein a candidate value represents a candidate for a possible match between image features of two or more images; establishing a matching penalty function for evaluation of the candidate values; selecting the candidate value based on the result of the evaluation of the matching penalty function.

10. The method for depth ordering of parts of one or more digital images as set forth in claim 7, wherein the dual segmentation is achieved by means of quasi segmentation, where for each pair of neighboring segments a seed is defined consisting of those pixels which belong to one of the segments and at least one of its neighbors belongs to the other segment, and where at least parts of the other pixels in the images are assigned to that seed to which their distance is smallest.

11. The method for depth ordering of parts of one or more digital images as set forth in claim 7, wherein estimating relative motion of the borders of the dual segments includes calculating optimal motion vectors for the dual segments; computing match penalties for the dual segments; selecting a closer segment by comparing the optimal motion vectors.

12. Computer program for enabling a processor to carry out the method for depth ordering of parts of one or more digital images as set forth in claim 7.

13. Tangible medium carrying the computer program as set forth in claim 12.

14. Specific hardware for enabling a processor to carry out the method for depth ordering of parts of one or more digital images as set forth in claim 7.

15. Reconfigurable hardware for enabling a processor to carry out the method for depth ordering of parts of one or more digital images as set forth in claim 7.